Outage-based ergodic link adaptation 
for fading channels with delayed CSIT 

Jung Hyun Bae, Jungwon Lee, Inyup Kang
Mobile Solutions Lab 
Samsung US R&D Center 
San Diego, CA, USA 
Email: jungbae@umich.edu, jungwon@alumni.stanford.edu,
inyup.kang@samsung.com



Abstract 

Link adaptation, in which the transmission data rate is dynamically adjusted according to channel
variation, is often used to deal with the time-varying nature of wireless channels. When channel state
information at the transmitter (CSIT) is delayed by more than the channel coherence time due to feedback
delay, however, the benefit of link adaptation can be lost if this delay is not taken into
account. One way to deal with such delay is to predict the current channel quality from the available
observation, but this inevitably results in prediction error. In this paper, an algorithm with a different
viewpoint is proposed. By using the conditional cdf of the current channel given the observation, the
outage probability can be computed for each value of the transmission rate $R$. By assuming that the
transmission block error rate (BLER) is dominated by the outage probability, the expected throughput can
also be computed, and $R$ can be chosen to maximize it. The proposed scheme is designed to be optimal if
the channel is ergodic, and it is shown to considerably outperform conventional schemes in a Rayleigh
fading channel model.

I. Introduction 

The time-varying nature of the wireless channel is one of its most important properties. To ensure
efficient and reliable communication in time-varying fading channels, current wireless
standards support link adaptation, in which the transmission data rate is dynamically adjusted
according to channel variation [1, Ch. 10]. An apparent challenge with link adaptation is the delay
of channel state information at the transmitter (CSIT). If the receiver observes the channel at some
point and feeds it back to the transmitter, then CSIT inevitably suffers from a delay, which
can be several transmission blocks.^1 Unless the channel coherence time is larger than this delay,
the delayed CSIT can be incorrect, which may take away most of the benefit of link
adaptation.

One way of dealing with CSIT delay is to predict the current channel quality given the delayed
observation (feedback). When the channel statistics are known, the optimal predictor, which minimizes
the mean square error, is the conditional mean [2, Thm. 3.A.1]. In practice, linear
predictors, which are also optimal when the channel is Gaussian distributed, are often used, and [3]-[5]
show several examples of this approach. One thing to note is that the prediction of the current
channel can never be perfect due to its random nature, i.e., the current channel is
a random variable conditioned on the observation. With the approach of predicting the current channel,
this randomness is a source of incorrect CSIT and is undesirable but unavoidable.

In this paper, we propose a scheme with a completely different viewpoint from predicting
channel quality. In essence, the proposed scheme finds the transmission data rate which maximizes
the expected throughput by using the cdf of the current channel conditioned on the observation.
Since it does not attempt to predict channel quality, it is free from the error of such
prediction. This does not mean that every transmission of the proposed scheme results in
successful decoding, since the determined data rate may correspond to a non-zero error rate of the
transmitted code block, i.e., block error rate (BLER). Instead, the scheme determines the 'best' BLER,
which maximizes the throughput. If ergodicity holds for the random process defined by the joint
distribution of the channels, then the proposed scheme maximizes the actual long-term throughput,
and hence it is optimal over all possible link adaptation schemes. This assumption of ergodicity
is thought to hold in many models, as discussed in [6, Ch. 5], and a numerical evaluation for a
specific model is given later in this paper. In addition to this optimality argument, simulation
results show that the proposed scheme significantly outperforms conventional ones.

An important assumption used to derive the proposed scheme is that a code block error
results from an outage event. An outage event, in which the channel quality does not support
the transmission data rate, is often used to analyze performance in slow fading channels with
no CSIT [6, Ch. 5], and the outage probability is determined by the transmission data rate and
the cdf of the channel. The situation with delayed CSIT is very similar to the no-CSIT case, and the
only difference is that the cdf of the channel varies depending on the observation. If channel coding
were optimal given correct CSIT, then the expected BLER with delayed CSIT would be determined by the
outage probability. Since no practical code is optimal, the outage probability alone
does not determine the expected BLER, but the assumption is still quite realistic because
practical channel coding is near optimal and, when CSIT is delayed, the major source of error is the
lack of CSIT rather than sub-optimal coding. Once the outage probability is found, the expected
throughput of a given transmission data rate can also be found under the above assumption, and hence
the rate which maximizes the throughput can be determined.

^1 In current wireless standards [1, Ch. 10], rate determination is done at the receiver side, and the rate request to the transmitter
is delayed. In this paper, we assume that rate determination is done at the transmitter side with delayed CSIT. Note that these
two scenarios are equivalent.

The remainder of the paper is organized as follows. Section II describes the channel model used
in this paper. Section III describes the proposed scheme in detail and presents a performance
evaluation as well as a numerical evaluation of ergodicity, which is closely related to the optimality
of the proposed scheme. Section IV discusses possible generalizations of the proposed scheme
to more complicated cases, and Section V concludes the paper.

II. Channel model 

Consider the following single-input single-output (SISO) block flat fading channel model
for the $n$th symbol of the $i$th code block, with channel output $y_i[n]$ at the receiver,
channel input $x_i[n]$ from the transmitter, background noise $z_i[n]$, and channel $h_i$:

$y_i[n] = \sqrt{P}\, h_i x_i[n] + z_i[n]$ for $n = 1, \ldots, N$,   (1)

where $N$ is the code block length. We assume that the background noise $z_i[n] \sim \mathcal{CN}(0, 1)$ is iid.
$\{h_i\}_{i=1}^{\infty}$ is a Rayleigh fading process, which means that $h_i \sim \mathcal{CN}(0, 1)$, and $P$ represents the
signal-to-noise ratio (SNR). We assume that $\{h_i\}_{i=1}^{\infty}$ is jointly Gaussian. One example of such a process
is Clarke's model [6, Ch. 2], which satisfies $E[h_i h_{i+l}^*] = J_0(2\pi f_D l T_b)$ for all integers
$l$, where $f_D$ is the Doppler frequency, $T_b$ is the code block duration, and $J_0$ is the zeroth-order Bessel
function of the first kind.
For code block $i$, we assume that the maximum achievable transmission data rate is the
same as the capacity of $h_i$, which is

$C_i = \log_2(1 + P|h_i|^2)$.   (2)

Note that this assumption is not true unless the length $N$ of each code block approaches infinity.
For ease of exposition, however, we keep this assumption throughout the paper. We also
assume that the CSIT delay is $l_d$ code block durations throughout the paper, i.e., the CSIT available at
the $i$th code block corresponds to the channel at the $(i - l_d)$th code block.

Although we assume one of the simplest time-varying fading channel models here, the
proposed scheme can be generalized to more complicated cases, such as multiple antennas
and/or fading within a code block. These generalizations are discussed in Section IV.

III. Outage-based ergodic link adaptation 

A. Description of the scheme 

Consider the $i$th code block. This code block results in an error if and only if the attempted
transmission data rate $R$ is greater than $C_i$, i.e., an outage occurs. As mentioned earlier, $h_i$ can
be viewed as a random variable conditioned on the available observations when there is CSIT delay,
and hence the outage probability can be computed for each value of $R$. This outage probability can
be used to compute the expected throughput at the $i$th code block. To compute the outage probability,
we first need to identify the distribution of $h_i$ conditioned on the observation. To do that, we need the
following lemma.

Lemma 1. [2, Chap. 3.C] If $[X^T, Y^T]^T$ is a real Gaussian random vector, then
given $Y = y$,

$X \sim \mathcal{N}(E[X|Y = y], C_{X|Y})$,   (3)

where $C_{X|Y} = C_X - A C_{YX}$, $E[X|Y = y] = A(y - m_Y) + m_X$, $A$ solves $A C_Y = C_{XY}$,
$m_X = E[X]$, $C_{XY} = E[(X - m_X)(Y - m_Y)^T]$, and $C_X = C_{XX}$.

Let $h_{obs(i)}$ be the vector of channel observations used to determine the rate for the $i$th code
block. Let $a_{re}$ be the real part of $a$, and $a_{im}$ be the imaginary part of $a$, for a complex scalar or vector
$a$. Then we can set $X = (h_{i,re}, h_{i,im})^T$ and $Y = (h_{obs(i),re}^T, h_{obs(i),im}^T)^T$, and apply Lemma 1 to
find the conditional distribution of $h_i$. To simplify the analysis, we consider $h_{obs(i)} = h_{i-l_d}$, which
means that only a single observation is used. Note that the number of channel observations in $h_{obs(i)}$
only affects the calculation of $E[X|Y = y]$ and $C_{X|Y}$, and the following analysis directly carries
over to any number of observations in $h_{obs(i)}$.

With the restriction to a single observation, the only thing that needs to be specified is the correlation
between $h_i$ and $h_{i-l_d}$. Let $C = E[h_i h_{i-l_d}^*]$. In Clarke's model, $C = J_0(2\pi f_D l_d T_b)$. Given $h_{i-l_d}$,

$h_i \sim \mathcal{CN}(C h_{i-l_d}, 1 - C^2)$.   (4)
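
As a small numerical illustration of the quantities in (4), the following sketch evaluates the Clarke-model correlation and the resulting conditional mean and variance. The function names and the example values of Doppler frequency, block duration, delay and observation are illustrative assumptions, not values from this paper. The Bessel function is computed from its power series, which is adequate only for small to moderate arguments.

```python
import math


def bessel_j0(x: float, terms: int = 40) -> float:
    """Zeroth-order Bessel function of the first kind via its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # term_{k+1} = term_k * (-(x/2)^2) / (k+1)^2
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def clarke_correlation(doppler_hz: float, delay_blocks: int, block_duration_s: float) -> float:
    """C = J_0(2 * pi * f_D * l_d * T_b) for Clarke's model."""
    return bessel_j0(2.0 * math.pi * doppler_hz * delay_blocks * block_duration_s)


def conditional_moments(corr: float, h_obs: complex):
    """Mean and variance of h_i given h_{i-l_d}, as in (4)."""
    return corr * h_obs, 1.0 - corr ** 2


# Hypothetical example: 10 Hz Doppler, 1 ms blocks, delay of 4 blocks.
corr = clarke_correlation(doppler_hz=10.0, delay_blocks=4, block_duration_s=1e-3)
mean, var = conditional_moments(corr, h_obs=0.8 + 0.3j)
print(corr, mean, var)
```

A larger Doppler frequency or delay lowers $C$, which widens the conditional variance $1 - C^2$ and makes the delayed observation less informative.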

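The outage probability $P_{i,out}(R)$ of the $i$th code block for a given $R$ can be expressed as

$P_{i,out}(R) = P\{C_i < R \mid h_{i-l_d}\}$   (5a)
$= P\{|h_i|^2 < \frac{2^R - 1}{P} \mid h_{i-l_d}\}$.   (5b)

Note that $|h_i|^2$ given $h_{i-l_d}$ has a (scaled) noncentral chi-squared distribution with 2 degrees of freedom,
and its cdf is given as

$P\{|h_i|^2 < x \mid h_{i-l_d}\} = 1 - Q_1\left(\sqrt{\frac{2C^2|h_{i-l_d}|^2}{1 - C^2}}, \sqrt{\frac{2x}{1 - C^2}}\right)$,   (6)

where $Q_1$ is the Marcum Q-function defined as

$Q_1(a, b) = \int_b^{\infty} x \exp\left(-\frac{x^2 + a^2}{2}\right) I_0(ax)\, dx$,   (7)

where $I_0$ is the modified Bessel function of order 0. Therefore,

$P_{i,out}(R) = 1 - Q_1\left(\sqrt{\frac{2C^2|h_{i-l_d}|^2}{1 - C^2}}, \sqrt{\frac{2(2^R - 1)}{P(1 - C^2)}}\right)$.   (8)

We can also express the expected throughput $TP_i(R)$ of the $i$th code block for a given $R$ as

$TP_i(R) = R(1 - P_{i,out}(R))$   (9a)
$= R\, Q_1\left(\sqrt{\frac{2C^2|h_{i-l_d}|^2}{1 - C^2}}, \sqrt{\frac{2(2^R - 1)}{P(1 - C^2)}}\right)$.   (9b)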



The proposed scheme chooses the $R$ that solves $\max_R TP_i(R)$ as the transmission
data rate of the $i$th code block. It can be seen that $TP_i(R)$ becomes the same as in the case where
$|h_i|$ is Rayleigh distributed if $C = 0$. In other words, if the channel is uncorrelated, then delayed
CSIT fails to give any information about the current channel. Hence, the proposed scheme, as well
as conventional methods of predicting the channel, is beneficial only if there is considerable
correlation in the channel. The performance of the proposed scheme for various channel
correlations is presented later in this paper.
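
The outage probability and expected throughput of this subsection can be evaluated numerically as in the following sketch. It assumes the arguments of $Q_1$ are $\sqrt{2C^2|h_{i-l_d}|^2/(1-C^2)}$ and $\sqrt{2(2^R-1)/(P(1-C^2))}$, which follow from the conditional distribution (4). The function names are illustrative. The Marcum Q-function is computed from the Poisson-mixture representation of the noncentral chi-squared tail, a standard identity chosen here to avoid external libraries; it is only suitable for moderate arguments.

```python
import math


def marcum_q1(a: float, b: float) -> float:
    """Q_1(a, b) = P{X > b^2} for X noncentral chi-squared (2 dof, noncentrality a^2)."""
    lam = a * a / 2.0  # Poisson mean of the mixture
    x = b * b / 2.0
    n_terms = int(lam + 12.0 * math.sqrt(lam) + 40)
    weight = math.exp(-lam)  # Poisson probability of j = 0
    term = math.exp(-x)      # e^{-x} x^k / k! at k = 0
    tail = term              # central chi-squared tail with 2(j + 1) dof at j = 0
    total = weight * tail
    for j in range(1, n_terms):
        weight *= lam / j
        term *= x / j
        tail += term
        total += weight * tail
    return min(total, 1.0)


def outage_probability(rate: float, snr: float, corr: float, obs_gain: float) -> float:
    """Outage probability for rate R, given obs_gain = |h_{i-l_d}|^2 and 0 <= corr < 1."""
    one_minus = 1.0 - corr ** 2
    a = math.sqrt(2.0 * corr ** 2 * obs_gain / one_minus)
    b = math.sqrt(2.0 * (2.0 ** rate - 1.0) / (snr * one_minus))
    return 1.0 - marcum_q1(a, b)


def expected_throughput(rate: float, snr: float, corr: float, obs_gain: float) -> float:
    """TP_i(R) = R * (1 - P_out(R))."""
    return rate * (1.0 - outage_probability(rate, snr, corr, obs_gain))


# Hypothetical example: SNR P = 10 (10 dB), C = 0.9, observed gain 1.2.
for r in (1.0, 2.0, 3.0, 4.0):
    print(r, expected_throughput(r, snr=10.0, corr=0.9, obs_gain=1.2))
```

With `corr = 0.0` the outage probability reduces to the Rayleigh expression $1 - \exp(-(2^R - 1)/P)$, matching the remark above that an uncorrelated channel makes the observation uninformative.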

As mentioned earlier, an important aspect of the proposed scheme is that it removes the process
of predicting the current channel. The validity of this approach lies in the ergodicity of the channel. It can
easily be seen that the transmitter allocates the same rate to all code blocks $i$ for which the values of $|h_{i-l_d}|$
are the same. If we only consider the actual throughput of the code blocks with those $i$'s, then it
must converge to $\max_R TP_i(R)$ given ergodicity of the channel, and the proposed scheme indeed
maximizes the actual long-term throughput.



B. Determination of search interval 

$TP_i(R)$ given in (9) involves the Marcum Q-function, which does not have a closed-form expression,
and this implies that the solution of $\max_R TP_i(R)$ cannot be found analytically. If
$TP_i(R)$ were concave, well-known existing algorithms could be applied to find the solution of
$\max_R TP_i(R)$ [7]. To the authors' best knowledge, however, $TP_i(R)$ is neither known nor
provable to be concave, and hence we consider brute-force search. Since the possible values of $R$ are
all positive numbers, an exhaustive search for the solution of $\max_R TP_i(R)$ can be prohibitively
difficult. To reduce the complexity of the search, we may first transform the original continuous search
problem into a discrete search problem by quantizing the search space of $R$, which is the non-negative
real line. This results in a loss of optimality, and even then the complexity of the search is
still prohibitive unless a finite search interval is established. In this section, we discuss
ways to establish such an interval.

Since we need to make sure that the solution of the original optimization problem $\max_R TP_i(R)$
belongs to the reduced search interval $[R_L, R_U]$, $R_L$ and $R_U$ must satisfy the following conditions.
First, $\frac{dTP_i(R)}{dR} > 0$ for all $R < R_L$. Second, $\frac{dTP_i(R)}{dR} < 0$ for all $R > R_U$. Hence, our objective is to
find an $R_U$ as small as possible and an $R_L$ as large as possible satisfying these conditions. The following theorem gives
an interval $[R_L, R_U]$ to which the solution of $\max_R TP_i(R)$ always belongs. Let

$\alpha = \sqrt{\frac{2C^2|h_{i-l_d}|^2}{1 - C^2}}$, $\beta = \sqrt{\frac{2}{P(1 - C^2)}}$, and $\gamma(R) = \sqrt{2^R - 1}$.
Theorem 1. The solution of $\max_R TP_i(R)$ always belongs to the interval $[R_L, R_U]$, where
$R_U = \max\{\bar{R}, \log_2(\frac{\alpha^2}{\beta^2} + 1)\}$ and $R_L = \underline{R}$. Here $\bar{R}$ is the unique solution of
$R\, 2^{R-1} = \frac{1 + \alpha\sqrt{\pi/2}}{\beta^2 \ln 2}$, and $\underline{R}$ is the unique solution of $R\, 2^{R-1} = \frac{1}{\beta^2 \ln 2}$.

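Proof: $TP_i(R)$ can be written as follows.

$TP_i(R) = R \int_{\beta\gamma(R)}^{\infty} x \exp\left(-\frac{x^2 + \alpha^2}{2}\right) I_0(\alpha x)\, dx$   (10a)
$= R \left(1 - \int_0^{\beta\gamma(R)} x \exp\left(-\frac{x^2 + \alpha^2}{2}\right) I_0(\alpha x)\, dx\right)$.   (10b)

Hence,

$\frac{dTP_i(R)}{dR} = Q_1(\alpha, \beta\gamma(R)) - R \beta \frac{d\gamma(R)}{dR} \beta\gamma(R) \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) I_0(\alpha\beta\gamma(R))$   (11a)
$= Q_1(\alpha, \beta\gamma(R)) - R \frac{\beta 2^R \ln 2}{2\gamma(R)} \beta\gamma(R) \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) I_0(\alpha\beta\gamma(R))$   (11b)
$= Q_1(\alpha, \beta\gamma(R)) - R\, 2^{R-1} (\ln 2) \beta^2 \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) I_0(\alpha\beta\gamma(R))$.   (11c)

The above expression still involves $Q_1(\cdot)$ and $I_0(\cdot)$, which do not have closed-form expressions.
Hence, we consider bounds on these functions to obtain an analytical solution for the search
interval.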

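To find $R_U$, let us first upper bound $\frac{dTP_i(R)}{dR}$ by upper bounding $Q_1(\cdot)$. Bounds on the Marcum
Q-function have been studied considerably, as seen in [8] and references therein. In [8], the upper
bound of $Q_1(\alpha, \beta\gamma(R))$ is given for $\beta\gamma(R) > \alpha$ as

$Q_1(\alpha, \beta\gamma(R)) \le I_0(\alpha\beta\gamma(R)) \left( \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) + \alpha\sqrt{\frac{\pi}{2}} \exp(-\alpha\beta\gamma(R))\, \mathrm{erfc}\left(\frac{\beta\gamma(R) - \alpha}{\sqrt{2}}\right) \right)$,   (12)

which is shown to be quite tight by numerical evaluation in [8]. Here, $\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-t^2} dt$,
$x > 0$, and the fact that it does not have a closed-form expression again becomes problematic.
To proceed further, we need to upper bound $\mathrm{erfc}(x)$. To simplify the analysis, we consider an upper
bound of $\mathrm{erfc}(x)$ which consists of a single exponential term. The well-known Chernoff bound,
which is given as $\mathrm{erfc}(x) \le 2\exp(-x^2)$ [9], is an example of such a bound. In [9], it is shown
that $\exp(-x^2)$ is the best upper bound of $\mathrm{erfc}(x)$ among general single-term exponential-type
bounds $a\exp(-bx^2)$, where $a, b > 0$. By using this, we get

$Q_1(\alpha, \beta\gamma(R)) \le I_0(\alpha\beta\gamma(R)) \left( \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) + \alpha\sqrt{\frac{\pi}{2}} \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) \right)$.   (13)

By upper bounding $Q_1(\alpha, \beta\gamma(R))$ in (11c), we get

$\frac{dTP_i(R)}{dR} \le I_0(\alpha\beta\gamma(R)) \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) \left(1 + \alpha\sqrt{\frac{\pi}{2}} - R\, 2^{R-1} \beta^2 \ln 2\right)$.   (14)

It can be seen that $R_U$ can be found using the above expression without further bounding $I_0(\cdot)$.
Indeed, we can say that $\frac{dTP_i(R)}{dR} < 0$ if $R > \bar{R}$, where $\bar{R}$ is the unique solution of
$R\, 2^{R-1} = \frac{1 + \alpha\sqrt{\pi/2}}{\beta^2 \ln 2}$. Since the bound (12) requires $\beta\gamma(R) > \alpha$, i.e., $R > \log_2(\frac{\alpha^2}{\beta^2} + 1)$,
we have $R_U = \max\{\bar{R}, \log_2(\frac{\alpha^2}{\beta^2} + 1)\}$.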

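To find $R_L$, we need to lower bound $\frac{dTP_i(R)}{dR}$ by lower bounding $Q_1(\alpha, \beta\gamma(R))$. A tight lower
bound is also derived in [8], but its complicated expression makes it hard to analyze, and hence
we use another lower bound given in [8], which turns out to be looser by the numerical evaluation
in [8]. The lower bound has the following expression.

$Q_1(\alpha, \beta\gamma(R)) \ge I_0(\alpha\beta\gamma(R)) \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right)$.   (15)

Then,

$\frac{dTP_i(R)}{dR} \ge I_0(\alpha\beta\gamma(R)) \exp\left(-\frac{\beta^2\gamma(R)^2 + \alpha^2}{2}\right) \left(1 - R\, 2^{R-1} \beta^2 \ln 2\right)$.   (16)

We can say that $\frac{dTP_i(R)}{dR} > 0$ if $R < \underline{R}$, where $\underline{R}$ is the unique solution of $R\, 2^{R-1} = \frac{1}{\beta^2 \ln 2}$, and
$R_L = \underline{R}$. ■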
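
The interval of Theorem 1 can be computed with a simple bisection, since $R\,2^{R-1}$ is increasing in $R \ge 0$. The sketch below assumes the thresholds $1/(\beta^2 \ln 2)$ for $R_L$ and $(1 + \alpha\sqrt{\pi/2})/(\beta^2 \ln 2)$ for $\bar{R}$, which is how the single-exponential bound $\mathrm{erfc}(x) \le \exp(-x^2)$ and the lower bound on $Q_1$ translate into conditions on $R$. Function names and the example numbers are illustrative.

```python
import math


def solve_rate_equation(c: float) -> float:
    """Solve R * 2**(R - 1) = c for R >= 0 by bisection (left side is increasing)."""
    lo, hi = 0.0, 1.0
    while hi * 2.0 ** (hi - 1.0) < c:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * 2.0 ** (mid - 1.0) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def search_interval(snr: float, corr: float, obs_gain: float):
    """Return (R_L, R_U) for obs_gain = |h_{i-l_d}|^2 and 0 <= corr < 1."""
    one_minus = 1.0 - corr ** 2
    alpha = math.sqrt(2.0 * corr ** 2 * obs_gain / one_minus)
    beta_sq = 2.0 / (snr * one_minus)
    scale = 1.0 / (beta_sq * math.log(2.0))
    r_low = solve_rate_equation(scale)
    r_bar = solve_rate_equation((1.0 + alpha * math.sqrt(math.pi / 2.0)) * scale)
    # The upper bound on Q_1 is only valid for beta * gamma(R) > alpha.
    r_up = max(r_bar, math.log2(alpha ** 2 / beta_sq + 1.0))
    return r_low, r_up


# Hypothetical example: SNR P = 10 (10 dB), C = 0.9, observed gain 1.2.
print(search_interval(snr=10.0, corr=0.9, obs_gain=1.2))
```

Only the upper end depends on the observation through $\alpha$; the lower end is fixed by $P$ and $C$.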

What one would ideally want from the interval $[R_L, R_U]$ is dependence of both $R_U$ and $R_L$ on
the observation. It can be seen from Theorem 1, however, that only $R_U$ depends on $\alpha$. This means that
the interval $[R_L, R_U]$ may not be very meaningful. Even for $R_U$, upper bounding $\mathrm{erfc}(\cdot)$ with
a single exponential term could result in a loose upper bound. To check the validity of the interval
$[R_L, R_U]$ in an actual channel model, we consider the following first-order autoregressive channel
model, which is also called the AR(1) model.

• Simulation channel model

$h_0 = h_{temp,0}$,   (17a)
$h_i = \sqrt{1 - C^2}\, h_{temp,i} + C h_{i-1}$, $i = 1, 2, \ldots$   (17b)
TABLE I

THE RATIO OF THE SEARCH INTERVAL [R_L, R_U] TO log_2(1 + P)
(value of log_2(1 + P) in parentheses next to each SNR)

Correlation \ SNR (dB)   0 (1)   4 (1.81)   8 (2.87)   12 (4.07)   16 (5.35)   20 (6.66)
0.1                      0.22    0.16       0.12       0.10        0.08        0.07
0.7                      0.53    0.42       0.36       0.31        0.28        0.26
0.9                      0.67    0.59       0.55       0.49        0.44        0.43
0.95                     0.75    0.69       0.67       0.60        0.54        0.51


TABLE II

EMPIRICAL PROBABILITY OF R_U = \bar{R}

Correlation \ SNR (dB)   0       4       8       12      16      20
0.1                      1       1       1       1       1       1
0.7                      0.86    0.77    0.67    0.56    0.45    0.38
0.9                      0.50    0.42    0.35    0.27    0.22    0.18
0.95                     0.33    0.27    0.23    0.17    0.14    0.11



where $h_{temp,i} \sim \mathcal{CN}(0, 1)$ and iid.
The above model has the following properties.

$E[h_i h_i^*] = 1 - C^{2i} \to 1$   (18a)
$E[h_i h_{i-1}^*] = C(1 - C^{2i}) \to C$.   (18b)

Hence, the above channel model is the same as the one considered in Section III-A for
sufficiently large $i$ if we assume $l_d = 1$ without loss of generality.
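
The AR(1) simulation model can be generated in a few lines. The following sketch draws the innovations as circularly symmetric complex Gaussians with unit variance and estimates the lag-one correlation empirically; the function names, seed and sample size are illustrative assumptions.

```python
import math
import random


def simulate_ar1(corr: float, n_blocks: int, seed: int = 0):
    """Generate h_0, ..., h_{n_blocks-1} from the first-order autoregressive model."""
    rng = random.Random(seed)
    s = math.sqrt(0.5)  # real and imaginary parts each have variance 1/2

    def draw() -> complex:
        return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

    h = [draw()]
    scale = math.sqrt(1.0 - corr ** 2)
    for _ in range(1, n_blocks):
        h.append(scale * draw() + corr * h[-1])
    return h


def empirical_lag1(h) -> complex:
    """Sample estimate of E[h_i h_{i-1}^*]."""
    return sum(h[i] * h[i - 1].conjugate() for i in range(1, len(h))) / (len(h) - 1)


channel = simulate_ar1(corr=0.7, n_blocks=20000, seed=1)
print(empirical_lag1(channel))  # real part should be close to 0.7
```

With $l_d = 1$, consecutive samples of this sequence play the roles of the observation $h_{i-1}$ and the current channel $h_i$.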

Table I shows the ratio of the empirical average of $R_U - R_L$ to $\log_2(1 + P)$ over $10^4$ realizations
of $h_i$ for various $P$ and $C$. The value of $\log_2(1 + P)$, which can be thought of as the
capacity of the average channel, is given for reference in parentheses next to each value of
$P$. It can be seen from Table I that the ratio of $R_U - R_L$ to $\log_2(1 + P)$ decreases as $P$ increases,
and the decrease is more dramatic with low correlation. $R_U$ is determined by the maximum of two terms as
in Theorem 1, and which term is active affects the results given in Table I. Table II shows
the empirical probability of $\bar{R}$ being active for $R_U$ over $10^4$ realizations of $h_i$. It can be seen
that $\bar{R}$ is active mostly for low correlation. If $R_U = \bar{R}$, then $R_U 2^{R_U} - R_L 2^{R_L} = \frac{\sqrt{2\pi}\,\alpha}{\beta^2 \ln 2}$, which
implies that $R_U - R_L$ increases as $P$ increases. The rate of this increase, however, is
slower than that of $\log_2(1 + P)$ due to the multiplicative factors $R_U$ and $R_L$ in
$R_U 2^{R_U} - R_L 2^{R_L}$, and it turns out to be much slower, as can be seen in Table I for the low
correlation case.

High correlation cases show slightly different trends, which may indicate looseness of the interval
$[R_L, R_U]$. Intuitively, the realization of the next channel instance is more likely to be
close to the observation when correlation is high. In that case, the search interval should be smaller
than in low correlation cases, but the results in Table I actually show the opposite. Looseness of $R_U$ is
possibly a reason for this phenomenon. Note that the upper bound on the Marcum Q-function which
is used in the proof of Theorem 1 is valid only for $\beta\gamma(R) > \alpha$. Because of this, $R_U$ is expressed
as the maximum of two terms, and $R_U$ is likely loose if $R_U = \log_2(\frac{\alpha^2}{\beta^2} + 1)$. Table II shows that
$R_U = \log_2(\frac{\alpha^2}{\beta^2} + 1)$ is more likely when correlation is high, and this can be another reason
why the interval $[R_L, R_U]$ does not become small with high correlation. Such possible looseness
of the interval $[R_L, R_U]$ can be troublesome, especially given that the proposed scheme
is likely to be meaningful only for sufficiently high correlation, and this means that further
optimization of the interval $[R_L, R_U]$ may be needed. Nevertheless, the absolute size of the interval
$[R_L, R_U]$ is not impractically large according to Table I, and hence the performance evaluation of
the proposed scheme in this paper is based on the search interval in Theorem 1.

C. Performance evaluation 

To evaluate the performance of the proposed scheme, we compare it with the following conventional
schemes.

1) Scheme 1 (A scheme which ignores the observation)

Since this scheme ignores the observation, its transmission data rate is fixed
throughout the whole transmission. This scheme treats each channel instance as a
circularly symmetric complex Gaussian random variable. Then, similar to the proposed
scheme, the rate $R_1$ of this scheme is determined to maximize the expected throughput
at each instance, and it is the solution of $\max_R \overline{TP}(R)$, where $\overline{TP}(R) = R \exp\left(-\frac{2^R - 1}{P}\right)$.

2) Scheme 2 (A scheme which predicts the current channel based on the observation)

It is natural to consider a method which minimizes the mean square error. The best
prediction $\hat{h}_i$ of $h_i$ in that sense is $\hat{h}_i = E[h_i | h_{obs(i)}]$. If we consider the channel model
in (17), then $\hat{h}_i = E[h_i | h_{i-1}] = C h_{i-1}$. Then the transmission data rate $R_{2,i}$ is the
capacity of the predicted channel, which is $R_{2,i} = \log_2(1 + P C^2 |h_{i-1}|^2)$.
It can be seen that $R_1$ is the rate allocated by the proposed scheme when the channel
correlation $C = 0$, i.e., $\alpha = 0$. In fact, the channel is ergodic when $C = 0$, and hence both the
proposed scheme and Scheme 1 are optimal in terms of long-term throughput. This also means
that the performance of these two schemes does not differ much when the channel correlation
is not high enough. Furthermore, it can be seen from Theorem 1 that $R_U = R_L$ when $\alpha = 0$.
Therefore, $R_1 = R_U$, where $R_U$ is the solution of $R\, 2^R = \frac{P}{\ln 2}$.
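
The two reference rates can be computed directly. The sketch below solves the Scheme 1 stationarity condition $R\,2^R = P/\ln 2$ by bisection and evaluates the Scheme 2 rate under the assumption that the capacity of the predicted channel is (2) evaluated at the predictor $C h_{i-1}$, i.e. $\log_2(1 + P C^2 |h_{i-1}|^2)$. Function names and example values are illustrative.

```python
import math


def scheme1_rate(snr: float) -> float:
    """Fixed rate maximizing R * exp(-(2**R - 1) / P): solves R * 2**R = P / ln 2."""
    target = snr / math.log(2.0)
    lo, hi = 0.0, 1.0
    while hi * 2.0 ** hi < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * 2.0 ** mid < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scheme1_throughput(rate: float, snr: float) -> float:
    """Expected throughput of a fixed rate when |h_i|^2 is exponential with unit mean."""
    return rate * math.exp(-(2.0 ** rate - 1.0) / snr)


def scheme2_rate(snr: float, corr: float, obs_gain: float) -> float:
    """Capacity of the MMSE-predicted channel C * h_{i-1}; obs_gain = |h_{i-1}|^2."""
    return math.log2(1.0 + snr * corr ** 2 * obs_gain)


# Hypothetical example at SNR P = 10 (10 dB).
r1 = scheme1_rate(10.0)
print(r1, scheme1_throughput(r1, 10.0), scheme2_rate(10.0, 0.9, 1.2))
```

Scheme 2 transmits at a rate the channel supports only when the realized gain is at least the predicted one, which is why its error rate does not vanish even for high correlation.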

As described above, Scheme 1 and Scheme 2 have closed-form solutions for the allocated
rate, and hence only the proposed scheme requires a search for the allocated rate. We consider
a search in the interval $[R_L, R_U]$ with 100 evenly spaced points. Figure 1 shows the throughput
results for various correlation values $C$. $10^4$ channel realizations are generated with the model
in (17), and the assumption that block errors result from outage events is used for the results in
Figure 1. It can be seen that, as expected, the proposed scheme shows almost identical performance to Scheme
1, which ignores the observation, when correlation is low. As correlation increases, the
performance gap between these two gets larger. Interestingly, Scheme 2 performs very poorly
with low correlation, and it does not outperform Scheme 1 even with $C = 0.95$. This is probably
because of the restriction to one observation, and it may be different with more observations. It is
clear, however, that the proposed scheme performs significantly better than a scheme which tries
to predict the current channel. With more observations, the proposed scheme should benefit
as well, and hence the performance gap would likely remain.
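
A compact Monte Carlo version of this comparison is sketched below, under several stated assumptions: $l_d = 1$ with the AR(1) model, block errors occurring exactly on outage, the search-interval thresholds $1/(\beta^2 \ln 2)$ and $(1 + \alpha\sqrt{\pi/2})/(\beta^2 \ln 2)$, the Scheme 2 rate $\log_2(1 + P C^2 |h_{i-1}|^2)$, and a series evaluation of $Q_1$ that is only suitable for moderate arguments. It is not the simulation code behind Figure 1, and all names are illustrative.

```python
import math
import random


def marcum_q1(a, b):
    """Q_1(a, b) as a Poisson mixture of central chi-squared tails."""
    lam, x = a * a / 2.0, b * b / 2.0
    weight, term = math.exp(-lam), math.exp(-x)
    tail = term
    total = weight * tail
    for j in range(1, int(lam + 12.0 * math.sqrt(lam) + 40)):
        weight *= lam / j
        term *= x / j
        tail += term
        total += weight * tail
    return min(total, 1.0)


def solve_increasing(f, target):
    """Bisection for f(r) = target, with f increasing on r >= 0."""
    lo, hi = 0.0, 1.0
    while f(hi) < target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < target else (lo, mid)
    return 0.5 * (lo + hi)


def proposed_rate(snr, corr, obs_gain, n_grid=100):
    """Grid search (n_grid >= 2 points) for the rate maximizing R * Q_1(alpha, beta * gamma(R))."""
    one_minus = 1.0 - corr ** 2
    alpha = math.sqrt(2.0 * corr ** 2 * obs_gain / one_minus)
    beta = math.sqrt(2.0 / (snr * one_minus))
    scale = 1.0 / (beta ** 2 * math.log(2.0))
    f = lambda r: r * 2.0 ** (r - 1.0)
    r_low = solve_increasing(f, scale)
    r_up = max(solve_increasing(f, (1.0 + alpha * math.sqrt(math.pi / 2.0)) * scale),
               math.log2(alpha ** 2 / beta ** 2 + 1.0))
    best_rate, best_tp = r_low, -1.0
    for j in range(n_grid):
        r = r_low + (r_up - r_low) * j / (n_grid - 1)
        tp = r * marcum_q1(alpha, beta * math.sqrt(2.0 ** r - 1.0))
        if tp > best_tp:
            best_rate, best_tp = r, tp
    return best_rate


def simulate(snr, corr, n_blocks=2000, n_grid=100, seed=0):
    """Average throughput of (proposed, Scheme 1, Scheme 2) over an AR(1) channel."""
    rng = random.Random(seed)
    s = math.sqrt(0.5)
    draw = lambda: complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    rate1 = solve_increasing(lambda r: r * 2.0 ** r, snr / math.log(2.0))
    h_prev, sums = draw(), [0.0, 0.0, 0.0]
    for _ in range(n_blocks):
        h = math.sqrt(1.0 - corr ** 2) * draw() + corr * h_prev
        capacity = math.log2(1.0 + snr * abs(h) ** 2)
        gain = abs(h_prev) ** 2
        rates = (proposed_rate(snr, corr, gain, n_grid), rate1,
                 math.log2(1.0 + snr * corr ** 2 * gain))
        for k, r in enumerate(rates):
            if r <= capacity:  # the block is decoded iff there is no outage
                sums[k] += r
        h_prev = h
    return tuple(total / n_blocks for total in sums)


if __name__ == "__main__":
    print(simulate(snr=10.0, corr=0.9, n_blocks=200, n_grid=50))
```

The pure-Python series makes this slow for $10^4$ realizations with 100 grid points; a vectorized or library-based $Q_1$ would be the natural replacement for larger runs.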

D. Evaluation of ergodicity 

[Fig. 1. Evaluated throughput for various correlations. Panels: (a) C = 0.1, (b) C = 0.7, (c) C = 0.9, (d) C = 0.95.]

Although the proposed scheme outperforms the conventional ones as shown in Figure 1, the
channel correlation needs to be sufficiently high for that to happen. If the proposed scheme
does not achieve the optimal performance, then there may be room for improvement. As
mentioned earlier, the proposed scheme is optimal if the channel is ergodic. Consider
the code blocks $i$ which have the same value of $|h_{i-l_d}|$. Optimality of the proposed scheme is
determined by whether the actual throughput of such code blocks converges to $\max_R TP_i(R)$.
Let $I_A(\cdot)$ be an indicator function, i.e., $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$
otherwise. We can think of $R\, I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)$ as a random variable for the $i$'s with $|h_{i-l_d}| = x$,
where $x > 0$. Then the statistical average of this random variable is $TP_i(R)$ given
$|h_{i-l_d}| = x$. We want this statistical average to be equal to the limit of the empirical average, given as

$\lim_{T\to\infty} \frac{\sum_{i=l_d+1}^{T} R\, I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}{\sum_{i=l_d+1}^{T} I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}$.

The following is the formal definition of this ergodicity.



Definition 1. A Rayleigh fading process $\{h_i\}_{i=1}^{\infty}$ where the $h_i$'s are jointly Gaussian is called weakly
TP-ergodic if it satisfies the following for $R, x > 0$ with probability 1.

$TP_i(R) = \lim_{T\to\infty} \frac{R \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}{\sum_{i=l_d+1}^{T} I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}$,   (19)

where $TP_i(R)$ is computed for $|h_{i-l_d}| = x$.

We can also think of pseudo ergodicity which is closely related to weak TP-ergodicity and 
has better operational implication. 

Definition 2. A Rayleigh fading process $\{h_i\}_{i=1}^{\infty}$ where the $h_i$'s are jointly Gaussian is called pseudo
weakly TP-ergodic if it satisfies the following for $R > 0$ with probability 1.

$\lim_{T\to\infty} \frac{\sum_{i=l_d+1}^{T} TP_i(R)}{T} = \lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)$.   (20)

In fact, the proposed scheme is optimal as long as the channel satisfies the above pseudo weak
TP-ergodicity. Let us now look at the relationship between the two ergodicity conditions.

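Lemma 2. Weak TP-ergodicity implies pseudo weak TP-ergodicity if

$\lim_{T\to\infty} \frac{R \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}{\sum_{i=l_d+1}^{T} I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}
= \frac{\lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}{\lim_{T\to\infty} \frac{1}{T} \sum_{i=l_d+1}^{T} I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})}$.   (21)
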

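Proof: Assume that weak TP-ergodicity holds. If the assumption (21) holds, then weak
TP-ergodicity can be written as

$TP_i(R) \lim_{T\to\infty} \frac{1}{T} \sum_{i=l_d+1}^{T} I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})
= \lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})$.   (22)

Consequently, we get

$\int_x \lim_{T\to\infty} \frac{1}{T} \sum_{i=l_d+1}^{T} TP_i(R)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})\, dx
= \int_x \lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})\, dx$.   (23)

Since $\frac{1}{T} \sum_{i=l_d+1}^{T} TP_i(R)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})$ and $\frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)\, I_{\{h_{i-l_d} : |h_{i-l_d}| = x\}}(h_{i-l_d})$
are upper bounded by $R$, the limit and the integration can be exchanged by the dominated convergence
theorem [10]. Then we have

$\lim_{T\to\infty} \frac{\sum_{i=l_d+1}^{T} TP_i(R)}{T} = \lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)$,   (24)

which proves the claim. ■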
Pseudo weak TP-ergodicity can be checked numerically by comparing the empirical average
of $\max_R TP_i(R)$ (here, the observation $|h_{i-l_d}|$ is random) with the actual throughput. We can
also think of a stronger form of ergodicity as follows.

Definition 3. A Rayleigh fading process $\{h_i\}_{i=1}^{\infty}$ where the $h_i$'s are jointly Gaussian is called strongly
TP-ergodic if it satisfies the following for all $R$ with probability 1.

$E_{h_{i-l_d}}[TP_i(R)] = \lim_{T\to\infty} \frac{R}{T} \sum_{i=l_d+1}^{T} I_{\{h_i : \log_2(1+P|h_i|^2) > R\}}(h_i)$.   (25)


Strong TP-ergodicity says that the actual throughput converges to the statistical average
of $\max_R TP_i(R)$ with respect to the marginal distribution of $h_i$. Strong TP-ergodicity requires
$E_{h_{i-l_d}}[TP_i(R)] = \lim_{T\to\infty} \frac{\sum_{i=l_d+1}^{T} TP_i(R)}{T}$ in addition to pseudo weak TP-ergodicity. Note that
strong TP-ergodicity does not need to hold for the proposed scheme to be optimal. Figure 2
shows the actual throughput, the empirical average of $\max_R TP_i(R)$, and the statistical average
of $\max_R TP_i(R)$ for various correlations using $10^4$ channel realizations with the model in (17).
It can be seen that they are very close for all channel correlations, which suggests that both the
strong and pseudo weak TP-ergodicity conditions hold, and that the proposed scheme is near
optimal in terms of long-term throughput.

IV. Generalization of the proposed scheme 

The proposed scheme considered in Section III is defined only for the SISO block flat fading
model given in (1). As mentioned earlier, it can be generalized to more complicated cases, and
this section discusses such generalizations.

A. SIMO 

[Fig. 2. Evaluation of ergodicity for various correlations. Recoverable panel labels: (c) C = 0.9, (d) C = 0.95.]

Consider the following single-input multi-output (SIMO) block flat fading model.

$\mathbf{y}_i[n] = \sqrt{P}\, \mathbf{h}_i x_i[n] + \mathbf{z}_i[n]$ for $n = 1, \ldots, N$,   (26)

where $\mathbf{h}_i$ is a vector of length $M$, and $M$ is the number of receive antennas. The
$M$ elements of $\mathbf{h}_i$ are independent of each other. Also, the $M$ elements of $\mathbf{z}_i[n]$ are
independent of each other. Let $h_{i,m}$ be the $m$th element of $\mathbf{h}_i$ and $z_{i,m}[n]$ be the $m$th element
of $\mathbf{z}_i[n]$. Similar to the SISO case, we assume that the background noise $z_{i,m}[n] \sim \mathcal{CN}(0, 1)$ is iid.
$\{h_{i,m}\}_{i=1}^{\infty}$ is a Rayleigh fading process, which means that $h_{i,m} \sim \mathcal{CN}(0, 1)$, and $P$ represents the SNR.
We assume that $\{h_{i,m}\}_{i=1}^{\infty}$ is jointly Gaussian. The proposed scheme can be extended directly to
this case. The capacity $C_i$ of the $i$th code block can be expressed as

$C_i = \log_2(1 + P\|\mathbf{h}_i\|^2)$.   (27)
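With the restriction to a single observation, let $C = E[h_{i,m} h_{i-l_d,m}^*]$. Given $h_{i-l_d,m}$,

$h_{i,m} \sim \mathcal{CN}(C h_{i-l_d,m}, 1 - C^2)$.   (28)

The outage probability $P_{i,out}(R)$ of the $i$th code block for a given $R$ can be expressed as

$P_{i,out}(R) = P\{C_i < R \mid \mathbf{h}_{i-l_d}\}$   (29a)
$= P\{\|\mathbf{h}_i\|^2 < \frac{2^R - 1}{P} \mid \mathbf{h}_{i-l_d}\}$.   (29b)

It can be seen that $\|\mathbf{h}_i\|^2$ given $\mathbf{h}_{i-l_d}$ has a (scaled) noncentral chi-squared distribution with $2M$ degrees of
freedom, whose cdf is given as

$P\{\|\mathbf{h}_i\|^2 < x \mid \mathbf{h}_{i-l_d}\} = 1 - Q_M\left(\sqrt{\frac{2C^2\|\mathbf{h}_{i-l_d}\|^2}{1 - C^2}}, \sqrt{\frac{2x}{1 - C^2}}\right)$,   (30)

where $Q_M$ is the Marcum Q-function defined as

$Q_M(a, b) = \int_b^{\infty} \frac{x^M}{a^{M-1}} \exp\left(-\frac{x^2 + a^2}{2}\right) I_{M-1}(ax)\, dx$,   (31)

where $I_{M-1}$ is the modified Bessel function of order $M - 1$. Therefore, the expected throughput
$TP_i(R)$ of the $i$th code block for a given $R$ is

$TP_i(R) = R(1 - P_{i,out}(R))$   (32a)
$= R\, Q_M\left(\sqrt{\frac{2C^2\|\mathbf{h}_{i-l_d}\|^2}{1 - C^2}}, \sqrt{\frac{2(2^R - 1)}{P(1 - C^2)}}\right)$.   (32b)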

Then, the proposed scheme chooses the $R$ that solves $\max_R TP_i(R)$ as the
transmission data rate of the $i$th code block. Similar to the SISO case, a closed-form solution
cannot be found, which means that a search interval as in Theorem 1 must be
determined to implement the scheme. As for $Q_1(\cdot)$, there are several results on bounds
for the general Marcum Q-function $Q_M(\cdot)$, and they may be used to determine such an interval.
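
For the SIMO case, the generalized Marcum Q-function can be evaluated with the same Poisson-mixture idea as for $M = 1$, now with $2M$ degrees of freedom. The sketch below is an illustration under the assumptions that all $M$ branches share the same correlation $C$ and that the arguments of $Q_M$ are $\sqrt{2C^2\|\mathbf{h}_{i-l_d}\|^2/(1-C^2)}$ and $\sqrt{2(2^R-1)/(P(1-C^2))}$; the function names are not from this paper, and the series is only suitable for moderate arguments.

```python
import math


def marcum_qm(m: int, a: float, b: float) -> float:
    """Q_M(a, b) = P{X > b^2} for X noncentral chi-squared (2M dof, noncentrality a^2)."""
    lam = a * a / 2.0
    x = b * b / 2.0
    # Central chi-squared tail with 2M dof: e^{-x} * sum_{k < M} x^k / k!
    term = math.exp(-x)
    tail = term
    for k in range(1, m):
        term *= x / k
        tail += term
    weight = math.exp(-lam)  # Poisson probability of j = 0
    total = weight * tail
    for j in range(1, int(lam + 12.0 * math.sqrt(lam) + 40)):
        weight *= lam / j
        term *= x / (m - 1 + j)  # extends the inner sum to k = M - 1 + j
        tail += term
        total += weight * tail
    return min(total, 1.0)


def simo_outage_probability(rate: float, snr: float, corr: float, obs_gains) -> float:
    """Outage probability given obs_gains = [|h_{i-l_d,m}|^2 for each antenna m]."""
    one_minus = 1.0 - corr ** 2
    a = math.sqrt(2.0 * corr ** 2 * sum(obs_gains) / one_minus)
    b = math.sqrt(2.0 * (2.0 ** rate - 1.0) / (snr * one_minus))
    return 1.0 - marcum_qm(len(obs_gains), a, b)


def simo_expected_throughput(rate: float, snr: float, corr: float, obs_gains) -> float:
    """TP_i(R) = R * (1 - P_out(R)) for the SIMO model."""
    return rate * (1.0 - simo_outage_probability(rate, snr, corr, obs_gains))


# Hypothetical example with M = 2 receive antennas.
print(simo_expected_throughput(3.0, snr=10.0, corr=0.9, obs_gains=[1.2, 0.4]))
```

For `m = 1` the function reduces to $Q_1$, so the SISO expressions are recovered as a special case.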

B. Fading within a code block 

Consider now the following SISO flat fading model.

$y_i[n] = \sqrt{P}\, h_i[n] x_i[n] + z_i[n]$ for $n = 1, \ldots, N$.   (33)

We assume that the background noise $z_i[n] \sim \mathcal{CN}(0, 1)$ is iid. $\{h_i[1], \ldots, h_i[N]\}_{i=1}^{\infty}$ is a Rayleigh
fading process, which means that $h_i[n] \sim \mathcal{CN}(0, 1)$, and $P$ represents the SNR. We assume that
$\{h_i[1], \ldots, h_i[N]\}_{i=1}^{\infty}$ is jointly Gaussian. The capacity $C_i$ of the $i$th code block can be calculated
as

$C_i = \max_{p(x^N)} \lim_{N\to\infty} \frac{1}{N} I(X^N; Y^N)$.   (34)




Note that

$I(X^N; Y^N) = h(Y^N) - h(Y^N | X^N)$   (35a)
$\overset{(a)}{=} h(Y^N) - \sum_{n=1}^{N} h(Y_n | X_n)$   (35b)
$\le \sum_{n=1}^{N} h(Y_n) - \sum_{n=1}^{N} h(Y_n | X_n)$   (35c)
$= \sum_{n=1}^{N} I(X_n; Y_n)$,   (35d)

where (a) comes from the fact that the background noise is iid. Therefore,

$C_i \le \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \log_2(1 + P|h_i[n]|^2)$.   (36)


Equality in the above expression is achieved when the $y_i[n]$ are independent, and this depends
on the joint statistics of the channel. Since it would be difficult to derive the exact capacity $C_i$
in this case, we consider the upper bound $\frac{1}{N} \sum_{n=1}^{N} \log_2(1 + P|h_i[n]|^2)$ as the approximation of
the capacity for finite $N$. The conditional distribution of $h_i[1], \ldots, h_i[N]$ given the observation can be
expressed by using Lemma 1 as in Section III-A, given that the correlation among all involved random
variables is specified. The expressions of the outage probability and the expected throughput,
however, would be much more complicated than those in Section III-A or Section IV-A. Hence,
we consider a further simplified expression as follows.

$$C_i \approx \log_2\left(1 + \frac{P}{N} \sum_{n=1}^{N} |h_i[n]|^2\right). \tag{37}$$

Note that we are further upper bounding the capacity by doing this because of the concavity of the log
function, which implies that using this will underestimate the outage probability. Let us assume
again a single code block of observation. Let $\mathbf{h}_i = (h_i[1], \ldots, h_i[N])^T$. We are first interested in the
distribution of $\mathbf{h}_i$ given $\mathbf{h}_{i-l_d}$. By using Lemma 1,

$$\mathbf{h}_i \sim \mathcal{CN}\left(A\,\mathbf{h}_{i-l_d},\; C_3 - A C_1^*\right), \tag{38}$$

where $A = C_1 C_2^{-1}$, $C_1 = E[\mathbf{h}_i \mathbf{h}_{i-l_d}^*]$, $C_2 = E[\mathbf{h}_{i-l_d} \mathbf{h}_{i-l_d}^*]$, and $C_3 = E[\mathbf{h}_i \mathbf{h}_i^*]$. Since $C_3 - A C_1^*$
may not be a diagonal matrix, $\sum_{n=1}^{N} |h_i[n]|^2$ given $\mathbf{h}_{i-l_d}$ has a generalized chi-squared distribution,
which has no known closed functional form. If we assume that $C_3 - A C_1^*$ is diagonal, then
$\sum_{n=1}^{N} |h_i[n]|^2$ given $\mathbf{h}_{i-l_d}$ has a noncentral chi-squared distribution with $2N$ degrees of freedom,
and the resulting outage probability and the expected throughput can be derived as in Section IV-A.
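
As a rough illustration of the diagonal case, the sketch below evaluates the outage probability implied by the approximation (37) and searches for the throughput-maximizing rate. It rests on assumptions that go beyond the text: the conditional covariance is taken to be $\sigma^2 I$ (diagonal with equal entries), the conditional means `means` and variance `var` are hypothetical inputs that would come from (38), and the noncentral chi-squared cdf is computed with a plain Poisson-mixture series that is only intended for moderate parameter values.

```python
import math

def ncx2_cdf(x, dof_pairs, lam, terms=400):
    """CDF at x of a noncentral chi-squared variable with 2*dof_pairs degrees
    of freedom and noncentrality lam, as a Poisson mixture of central CDFs."""
    if x <= 0.0:
        return 0.0
    half_x, half_lam = x / 2.0, lam / 2.0
    # Central chi-squared with 2m DOF: CDF = 1 - exp(-x/2) * sum_{t<m} (x/2)^t / t!
    term = math.exp(-half_x)          # summand for t = 0
    tail = 0.0
    for t in range(dof_pairs):
        tail += term
        term *= half_x / (t + 1)
    weight = math.exp(-half_lam)      # Poisson weight for j = 0
    cdf = 0.0
    for j in range(terms):
        cdf += weight * (1.0 - tail)  # component with 2*(dof_pairs + j) DOF
        tail += term
        term *= half_x / (dof_pairs + j + 1)
        weight *= half_lam / (j + 1)
    return min(max(cdf, 0.0), 1.0)

def outage_prob(rate, snr, means, var):
    """P{log2(1 + (snr/N) * sum |h[n]|^2) < rate}, assuming h[n] given the
    observation are independent CN(means[n], var) (assumption of this sketch)."""
    n = len(means)
    threshold = n * (2.0 ** rate - 1.0) / snr           # bound on sum |h[n]|^2
    lam = 2.0 * sum(abs(m) ** 2 for m in means) / var   # noncentrality
    return ncx2_cdf(2.0 * threshold / var, n, lam)

def best_rate(snr, means, var, r_max=8.0, steps=800):
    """Grid search for the rate maximizing R * (1 - P_out(R))."""
    best = (0.0, 0.0)
    for k in range(1, steps + 1):
        r = r_max * k / steps
        tp = r * (1.0 - outage_prob(r, snr, means, var))
        if tp > best[1]:
            best = (r, tp)
    return best

snr = 10.0
means = [0.9 + 0.1j, 0.8, 0.7 - 0.2j, 0.85]   # hypothetical conditional means
var = 0.2                                     # hypothetical conditional variance
print(best_rate(snr, means, var))
```

The scaling used here is that $2\sum_n |h_i[n]|^2 / \sigma^2$ is noncentral chi-squared with $2N$ degrees of freedom and noncentrality $2\sum_n |\mu_n|^2 / \sigma^2$, where $\mu_n$ are the conditional means.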

C. HARQ 

Current wireless standards support hybrid automatic repeat request (HARQ) to improve the
reliability of transmission. Under HARQ, the received signal at each transmission is not discarded
even with incorrect decoding, and the retransmitted signal is combined with the stored previous
signals. There are two prevalent combining methods called Chase combining (CC) and incremental
redundancy (IR). Detailed descriptions of them can be found in [11] for CC and in [12] for IR.

Let us consider a single retransmission for simplicity. We also consider SISO block flat 
fading model given in (OQ) with a single code block observation. The transmitter determines the 
transmission data rate of code block $i$ with $h_{i-l_d}$. Assume that a possible retransmission occurs at the
$(i + l_r)$th code block. For both combining schemes, the outage event of the first transmission
is the event in which the transmission data rate $R$ is greater than the channel capacity $C_i$. For
IR, the outage event of the second transmission is the event in which the transmission data rate
$R$ is greater than the sum of the capacities of the two channels, $C_i + C_{i+l_r}$. From now on, we will restrict
ourselves to IR.

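Let $C = E[h_i h^*_{i-l_d}]$, $C' = E[h_i h^*_{i+l_r}]$, and $C'' = E[h_{i-l_d} h^*_{i+l_r}]$. Let $P^1_{i,out}$ be the outage probability
of the first transmission. Then, it is given as before,

$$P^1_{i,out}(R) = 1 - Q_1\left(\sqrt{\frac{2|C|^2 |h_{i-l_d}|^2}{1 - |C|^2}},\; \sqrt{\frac{2(2^R - 1)}{P(1 - |C|^2)}}\right). \tag{39}$$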
Let $P^2_{i,out}$ be the outage probability of the second transmission. To compute it, we need to know the
distribution of $|h_{i+l_r}|^2$ given $h_i$ and $h_{i-l_d}$. From Lemma 1, given $h_i, h_{i-l_d}$,

$$h_{i+l_r} \sim \mathcal{CN}\left(A (h_i, h_{i-l_d})^T,\; 1 - A C_1^*\right), \tag{40}$$

where $A = C_1 C_2^{-1}$, $C_1 = E[h_{i+l_r} (h_i^*, h_{i-l_d}^*)] = (C'^*, C''^*)$, and

$$C_2 = E[(h_i, h_{i-l_d})^T (h_i^*, h_{i-l_d}^*)] = \begin{pmatrix} 1 & C \\ C^* & 1 \end{pmatrix}.$$

Hence, $|h_{i+l_r}|^2$ given $h_i, h_{i-l_d}$ has a noncentral chi-squared distribution with 2 degrees
of freedom, and its cdf is given as

$$P\{|h_{i+l_r}|^2 < x \mid h_i, h_{i-l_d}\} = 1 - Q_1\left(\sqrt{\frac{2|A (h_i, h_{i-l_d})^T|^2}{1 - A C_1^*}},\; \sqrt{\frac{2x}{1 - A C_1^*}}\right). \tag{41}$$
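Then, $P^2_{i,out}$ is given as

$$P^2_{i,out}(R) = E\left[ I(C_i < R) \left(1 - Q_1\left(\sqrt{\frac{2|A (h_i, h_{i-l_d})^T|^2}{1 - A C_1^*}},\; \sqrt{\frac{2(2^{R - C_i} - 1)}{P(1 - A C_1^*)}}\right)\right) \,\middle|\, h_{i-l_d} \right], \tag{42}$$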

where $I(\cdot)$ is the indicator function. Without HARQ, the expected throughput can clearly be
expressed as $TP_i(R) = R(1 - P^1_{i,out}(R))$, as given earlier. With HARQ, however, there is no simple
expression for the expected throughput. For example, let us consider the following expression
as the expected throughput $TP_{i,HARQ}(R)$.

$$TP_{i,HARQ}(R) = R\left(1 - P^1_{i,out}(R)\right) + \frac{R}{2}\left(1 - P^2_{i,out}(R)\right). \tag{43}$$

The above expression is incorrect because it does not account for the throughput at the
$(i + l_r)$th code block when the first transmission is successful. Since the rate of the $(i + l_r)$th
code block is not determined at the $i$th code block, i.e., it is a future decision, the rate allocation
problem becomes a stochastic control problem with an infinite horizon, which is usually solved
by dynamic programming [13]. The solvability of such a problem needs to be carefully investigated,
and it will not be discussed here since it is beyond the scope of this paper.

The lack of a simple throughput expression means that the proposed scheme can only be sub-optimal
if one sticks with some simple, approximate throughput expression. This problem, however, is
not unique to the proposed scheme. Even if we consider an approach of predicting the channel,
we would still have exactly the same problem. In fact, the approach of predicting the channel
has another problem under HARQ. If we are interested in rate allocation with consideration of
a possible retransmission, then we must be able to somehow characterize $P^2_{i,out}$. With the approach
of predicting the channel, the notion of outage probability is unclear, which means that it would
be hard to design a scheme that works with any kind of HARQ throughput expression
depending on $P^2_{i,out}$. Note that the proposed scheme at least has a systematic way of determining
this outage probability, and this could be a significant advantage under HARQ.
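
To make the two IR outage events concrete, the following sketch estimates them by Monte Carlo simulation using the conditional Gaussian form of (40). Several things here are assumptions rather than part of the paper's derivation: the function names are hypothetical, the correlations in the usage lines follow an AR(1)-style pattern $\rho^{|k|}$ chosen only for illustration, and the second estimate is read as the joint probability that the first transmission fails and $C_i + C_{i+l_r} < R$.

```python
import math
import random

def cgauss(mean, var, rng):
    """Draw one sample from CN(mean, var)."""
    s = math.sqrt(var / 2.0)
    return mean + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def conditional_params(c, c_p, c_pp):
    """Return (A, variance) for h_{i+lr} given (h_i, h_{i-ld}), unit-variance
    channels, with c = E[h_i h*_{i-ld}], c_p = E[h_i h*_{i+lr}],
    c_pp = E[h_{i-ld} h*_{i+lr}]."""
    c1 = (c_p.conjugate(), c_pp.conjugate())      # C_1
    det = 1.0 - abs(c) ** 2                       # det of C_2 = [[1, c], [c*, 1]]
    a0 = (c1[0] - c1[1] * c.conjugate()) / det    # A = C_1 * inverse(C_2)
    a1 = (c1[1] - c1[0] * c) / det
    var = 1.0 - (a0 * c1[0].conjugate() + a1 * c1[1].conjugate()).real
    return (a0, a1), var

def ir_outage(rate, snr, h_obs, c, c_p, c_pp, trials=20000, seed=0):
    """Monte Carlo estimates of P{C_i < R} and P{C_i < R, C_i + C_{i+lr} < R}
    given the delayed observation h_obs = h_{i-ld}."""
    rng = random.Random(seed)
    (a0, a1), var2 = conditional_params(c, c_p, c_pp)
    fail1 = fail2 = 0
    for _ in range(trials):
        h_i = cgauss(c * h_obs, 1.0 - abs(c) ** 2, rng)
        cap1 = math.log2(1.0 + snr * abs(h_i) ** 2)
        if cap1 >= rate:
            continue                              # first transmission succeeds
        fail1 += 1
        h_r = cgauss(a0 * h_i + a1 * h_obs, var2, rng)
        cap2 = math.log2(1.0 + snr * abs(h_r) ** 2)
        if cap1 + cap2 < rate:
            fail2 += 1                            # still in outage after IR
    return fail1 / trials, fail2 / trials

rho, l_d, l_r = 0.9, 2, 4                         # hypothetical AR(1)-style values
p1, p2 = ir_outage(rate=3.0, snr=10.0, h_obs=0.5 + 0.3j,
                   c=rho ** l_d, c_p=rho ** l_r, c_pp=rho ** (l_d + l_r))
print(p1, p2)
```

Under the AR(1)-style correlations assumed above, the weight on the delayed observation in $A$ vanishes, so the retransmission channel depends on the observation only through $h_i$; other correlation structures would generally give both weights nonzero.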

V. Conclusion 

In this paper, we have proposed an algorithm which maximizes long-term throughput for an
ergodic fading channel with delayed CSIT. For performance evaluation, we have considered an
AR(1) model with Rayleigh fading, and the proposed scheme considerably outperforms conventional
ones in this model. Ergodicity conditions which need to be satisfied to ensure optimality
of the proposed scheme are discussed, and numerical evaluation shows near optimality of the
proposed scheme for the AR(1) model. As long as the channel is ergodic, the proposed scheme
is optimal in terms of long-term throughput for any Rayleigh fading channel in which
each channel symbol is correlated with a finite number of channel symbols. Such ergodicity is
generally hard to prove, but it is often assumed for various wireless channels, as mentioned
earlier.

The proposed scheme can also be generalized beyond the SISO block fading model. In the
SIMO case, the expected throughput involves the generalized Marcum Q-function $Q_M(\cdot)$, which
possibly needs a bounding analysis like the one given for $Q_1(\cdot)$ in this paper. The proposed scheme can
be applied to the case of fading within a code block as well, although the absence of a closed-form
functional expression for the cdf of the current channel would make things more complicated. For
the case of HARQ, the correct throughput expression must be determined before applying the
proposed scheme, but it can still be used if one is interested in maintaining a certain BLER instead
of maximizing the throughput.

Although we have exclusively considered the Rayleigh fading model in this paper, the proposed
scheme can be applied to any channel model given that the joint channel statistics are known. It can
be computationally difficult, however, to maximize the expected throughput over the transmission rate
$R$ for a general channel model. In the Rayleigh fading model, the expected throughput is described
in terms of the well-known Marcum Q-function, and such maximization is done with the aid of
previous studies in the literature on the Marcum Q-function.